Use only explicit NVTX3 V1 API in CUB #1751
Conversation
Force-pushed from 3e5ea18 to 285792a
🟨 CI Results [ Failed: 55 | Passed: 143 | Total: 198 ]

| # | Runner |
|---|---|
| 154 | linux-amd64-cpu16 |
| 16 | linux-arm64-cpu16 |
| 16 | linux-amd64-gpu-v100-latest-1 |
| 12 | windows-amd64-cpu16 |
👃 Inspect Changes

Modifications in project?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| | Thrust |
| | CUDA Experimental |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
```cpp
// Include our NVTX3 C++ wrapper if not available from the CTK
#  if __has_include(<nvtx3/nvtx3.hpp>) // TODO(bgruber): replace by a check for the first CTK version shipping the header
#    include <nvtx3/nvtx3.hpp>
#  else // __has_include(<nvtx3/nvtx3.hpp>)
#    include "nvtx3.hpp"
#  endif // __has_include(<nvtx3/nvtx3.hpp>)

#  include <cuda/std/optional>

// Furthermore, we only support the NVTX3 C++ API V1
#  ifdef NVTX3_CPP_DEFINITIONS_V1_0
```
question: do you think it's less likely for the NVTX version to change than for a user to require the explicit ABI version?

We discussed the idea that if NVTX3_CPP_REQUIRE_EXPLICIT_VERSION is defined, we'd disable NVTX support on the CUB end. That approach works even when the version is changed on the user's side. This PR goes a different path and binds CUB to a concrete version of NVTX. To me, it seems unlikely that users define NVTX3_CPP_REQUIRE_EXPLICIT_VERSION, so the initial approach seems more compelling: it would not require us to disable NVTX support on every NVTX version change, and disabling NVTX when an explicit version is required also seems easier from a maintenance perspective.
> question: do you think it's less likely for NVTX version to change than for user to require explicit ABI version?
I think so, but it's hard to say. Using the explicit version is the practice recommended by NVTX3 for header-based libraries. See here: https://github.com/NVIDIA/NVTX/blob/release-v3/c/include/nvtx3/nvtx3.hpp#L32-L38.
> Since NVTX3_CPP_REQUIRE_EXPLICIT_VERSION allows all combinations of versions
> to coexist without problems within a translation unit, the recommended best
> practice for instrumenting header-based libraries with NVTX C++ Wrappers
> is to #define NVTX3_CPP_REQUIRE_EXPLICIT_VERSION before including nvtx3.hpp,
> #undef it afterward, and only use explicit-version symbols. This is not
> necessary in common cases, such as instrumenting a standalone application, or
> static/shared libraries in .cpp files or headers private to those projects.
And it's not only about the user: CCCL could also be mixed with any other library using NVTX3 with explicit versioning, and we ship a fair amount of libraries with and around the CTK.

If the NVTX major/minor version changes, users would get a warning, so we can have another go at this issue once we have more information. We may want to discuss this aspect further, though, since we would not want this warning to trigger forever in case a CCCL with a newer version of NVTX3 were combined and shipped into the same CTK.
> We discussed the idea that if NVTX3_CPP_REQUIRE_EXPLICIT_VERSION is defined, we'd disable NVTX support on CUB end.
I know, and I don't like that approach. It just feels like a usability bug to me. Imagine a user using CCCL and enjoying NVTX ranges in CUB. Then they add an unrelated third-party library which defines NVTX3_CPP_REQUIRE_EXPLICIT_VERSION, or they decide themselves to switch to the explicit API to avoid conflicts, and suddenly all NVTX ranges in CUB are gone. If I were that user, I would file a bug report.

I just strongly believe there is a better solution here.
Force-pushed from 285792a to fec0a4e
I may have found the solution. It seems to me that newer (non-V1) versions of NVTX3 will continue to ship the V1 symbols regardless. So if CUB uses the explicit V1 API, we are covered in all situations. I opened the following issue to confirm this: NVIDIA/NVTX#96. Let's wait for the response.
Force-pushed from fec0a4e to 24a02dc
🟩 CI Results [ Failed: 0 | Passed: 198 | Total: 198 ]

| # | Runner |
|---|---|
| 154 | linux-amd64-cpu16 |
| 16 | linux-arm64-cpu16 |
| 16 | linux-amd64-gpu-v100-latest-1 |
| 12 | windows-amd64-cpu16 |
👃 Inspect Changes

Modifications in project?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| | Thrust |
| | CUDA Experimental |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
Force-pushed from 24a02dc to ffa6e89
🟩 CI finished in 2h 54m: Pass: 100%/249 | Total: 3d 22h | Avg: 22m 44s | Max: 55m 21s | Hits: 75%/248441

Modifications in project?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| | Thrust |
| | CUDA Experimental |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |

🏃 Runner counts (total jobs: 249)

| # | Runner |
|---|---|
| 178 | linux-amd64-cpu16 |
| 40 | linux-amd64-gpu-v100-latest-1 |
| 16 | linux-arm64-cpu16 |
| 15 | windows-amd64-cpu16 |
The explicit V1 API is always available. See discussion here: NVIDIA/NVTX#96

Fixes: NVIDIA#1750
Force-pushed from ffa6e89 to a6bdb69
I had a small conversation with someone from the NVTX3 team and they confirmed that the V1 API of NVTX3 is intended to stay forever. So the approach taken in this PR should work for both explicit and implicit API usage. This is the best solution I have for now, and I believe it's better than the status quo we have.
🟩 CI finished in 5h 51m: Pass: 100%/249 | Total: 4d 00h | Avg: 23m 10s | Max: 54m 07s | Hits: 74%/248564

Modifications in project?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| | Thrust |
| | CUDA Experimental |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |

🏃 Runner counts (total jobs: 249)

| # | Runner |
|---|---|
| 178 | linux-amd64-cpu16 |
| 40 | linux-amd64-gpu-v100-latest-1 |
| 16 | linux-arm64-cpu16 |
| 15 | windows-amd64-cpu16 |
This PR lets the CUB headers detect the available NVTX3 C++ API wrapper version and only provide NVTX range functionality if V1 is detected. Furthermore, CUB programs against the explicit API version, which is available regardless of whether the user consumes NVTX3 via the explicit or implicit API.
This PR is an evolution of: #1688
Fixes: #1750